Abstract

Convergence of the ensemble Kalman filter in the limit for large ensembles 
to the Kalman filter is proved. In each step of the filter, convergence of 
the ensemble sample covariance follows from a weak law of large numbers for 
exchangeable random variables, Slutsky's theorem gives weak convergence of 
ensemble members, and $L^p$ bounds on the ensemble then give $L^p$ convergence.

1. Introduction

Data assimilation, a topic of importance in many disciplines, uses statistical 
estimation to update the state of a running model based on new data. One 
of the most successful recent data assimilation methods is the ensemble Kalman
filter (EnKF). EnKF is a Monte-Carlo approximation of the Kalman filter (KF), 
with the covariance in the KF replaced by the sample covariance computed from 
an ensemble of realizations. Because the EnKF does not need to maintain the 
state covariance matrix, it is suitable for high-dimensional problems. 

A large body of literature on the EnKF and variants exists, but rigorous 
probabilistic analysis is lacking. It is commonly assumed that the ensemble 
is a sample (that is, i.i.d.) and it is normally distributed. Although the 
resulting analyses played an important role in the development of EnKF, both 
assumptions are false. The ensemble covariance is computed from all ensemble 
members together, thus introducing dependence, and the EnKF formula is 
a nonlinear function of the ensemble, thus destroying the normality of the 
ensemble distribution. 

The present analysis does not employ these two assumptions. The ensemble 
members are shown to be exchangeable random variables bounded in $L^p$, which
provides properties that replace independence and normality. An argument 
using uniform integrability and Slutsky's theorem is then possible. The 
result is valid for the EnKF version of Burgers, van Leeuwen, and Evensen
(Burgers et al., 1998) in the case of constant state space dimension, a linear
model, normal data likelihood and initial state distributions, and ensemble size
going to infinity. This EnKF version involves randomization of data. Efficient
variants of EnKF without randomization exist (Anderson, 1999; Tippett et al.,
2003), but they are not the subject of this paper.



The analysis in Burgers et al. (1998) consists of the comparison of the
covariance of the analysis ensemble and the covariance of the filtering
distribution under the assumption that the ensemble covariance converges in the
limit for large ensembles. Furrer and Bengtsson (2007) note that if the ensemble
sample covariance is a consistent estimator, then Slutsky's theorem yields the 
convergence in probability of the gain matrix. When this article was being 
completed, we became aware of a presentation by Le Gland which announces 
related results but does not seem to take advantage of exchangeability. 



2. Preliminaries 



The Euclidean norm of column vectors in $\mathbb{R}^m$, $m \ge 1$, and the induced
matrix norm are denoted by $\|\cdot\|$, and $^{\mathrm T}$ is the transpose. The (stochastic) $L^p$
norm of a random element $X$ is $\|X\|_p = (E(\|X\|^p))^{1/p}$. The $j$-th entry of a vector
$X$ is $[X]_j$ and the $i,j$ entry of a matrix $Y \in \mathbb{R}^{m \times n}$ is $[Y]_{ij}$. Weak convergence
(convergence in distribution) is denoted by $\Rightarrow$; weak convergence to a constant
is the same as convergence in probability. All convergence is for $N \to \infty$. We
denote by $X_N = [X_{Ni}]_{i=1}^N = [X_{N1}, \ldots, X_{NN}]$, with various superscripts and
for various $m \ge 1$, an ensemble of $N$ random elements, called members, with
values in $\mathbb{R}^m$. Thus, an ensemble is a random $m \times N$ matrix with the ensemble
members as columns. Given two ensembles $X_N$ and $Y_N$, the stacked ensemble
$[X_N; Y_N]$ is defined as the block random matrix

$$[X_N; Y_N] = \begin{bmatrix} X_N \\ Y_N \end{bmatrix} = \left[ \begin{bmatrix} X_{N1} \\ Y_{N1} \end{bmatrix}, \ldots, \begin{bmatrix} X_{NN} \\ Y_{NN} \end{bmatrix} \right].$$

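If all the members of $X_N$ are identically distributed, we write $E(X_{N1})$ and
$\mathrm{Cov}(X_{N1})$ for their common mean vector and covariance matrix. The ensemble
sample mean and ensemble sample covariance matrix are the random elements

$$\bar X_N = \frac{1}{N} \sum_{i=1}^N X_{Ni} \quad \text{and} \quad C(X_N) = \overline{X_N X_N^{\mathrm T}} - \bar X_N \bar X_N^{\mathrm T},$$

where $\overline{X_N X_N^{\mathrm T}} = \frac{1}{N} \sum_{i=1}^N X_{Ni} X_{Ni}^{\mathrm T}$.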
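
As a minimal illustration of these two definitions, the following Python sketch computes the ensemble sample mean and the ensemble sample covariance for a small ensemble. The storage layout (a list of members, each a list of floats), the function names, and the numbers are assumptions made for this example only. Note the $1/N$ normalization, as in the definition above, rather than $1/(N-1)$.

```python
def ensemble_mean(X):
    """Sample mean of an ensemble stored as a list of N members in R^m."""
    N, m = len(X), len(X[0])
    return [sum(x[j] for x in X) / N for j in range(m)]


def ensemble_cov(X):
    """C(X_N): mean of x x^T minus (mean x)(mean x)^T, with 1/N normalization."""
    N, m = len(X), len(X[0])
    xbar = ensemble_mean(X)
    return [
        [sum(x[j] * x[l] for x in X) / N - xbar[j] * xbar[l] for l in range(m)]
        for j in range(m)
    ]


# Three members in R^2 (made-up values for illustration).
X = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
mean = ensemble_mean(X)   # [3.0, 2.0]
C = ensemble_cov(X)       # every entry equals 8/3 for this ensemble
print(mean, C)
```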

We will work with ensembles such that the joint distribution of the ensemble
$X_N$ is invariant under a permutation of the ensemble members. Such an ensemble
is called exchangeable. An ensemble $X_N$ is exchangeable if and only if
$\Pr(X_N \in B) = \Pr(X_N \Pi \in B)$ for every Borel set $B \subset \mathbb{R}^{m \times N}$ and every
permutation matrix $\Pi \in \mathbb{R}^{N \times N}$. The covariance between any two members
of an exchangeable ensemble is the same, $\mathrm{Cov}(X_{Ni}, X_{Nj}) = \mathrm{Cov}(X_{N1}, X_{N2})$,
$i \ne j$.

Lemma 1. Suppose $X_N$ and $D_N$ are exchangeable, the random elements
$X_N$ and $D_N$ are independent, and $Y_{Ni} = F(X_N, X_{Ni}, D_{Ni})$, $i = 1, \ldots, N$,
where $F$ is measurable and permutation invariant in the first argument, i.e.
$F(X_N \Pi, X_{Ni}, D_{Ni}) = F(X_N, X_{Ni}, D_{Ni})$ for any permutation matrix $\Pi$. Then
$Y_N$ is exchangeable.



Proof. Write $Y_N = \mathbf{F}(X_N, D_N)$, where

$$\mathbf{F}(X_N, D_N) = [F(X_N, X_{N1}, D_{N1}), \ldots, F(X_N, X_{NN}, D_{NN})].$$

Let $\Pi$ be a permutation matrix. Then $Y_N \Pi = \mathbf{F}(X_N \Pi, D_N \Pi)$. Because $X_N$
is exchangeable, the distributions of $X_N$ and $X_N \Pi$ are identical. Similarly,
the distributions of $D_N$ and $D_N \Pi$ are identical. Since $X_N$ and $D_N$
are independent, the joint distributions of $(X_N, D_N)$ and $(X_N \Pi, D_N \Pi)$ are
identical. Thus, for any Borel set $B \subset \mathbb{R}^{n \times N}$,

$$\Pr(Y_N \Pi \in B) = E(1_B(Y_N \Pi)) = E(1_B(\mathbf{F}(X_N \Pi, D_N \Pi))) = E(1_B(\mathbf{F}(X_N, D_N))) = \Pr(Y_N \in B). \quad \Box$$
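
The permutation invariance required of $F$ in Lemma 1 can be checked numerically for the kind of ensemble statistic the EnKF uses. The sketch below, for a scalar ensemble (an assumption made to keep the example short), verifies that the sample variance is unchanged, up to rounding, when the members are reordered. The function name and the random ensemble are hypothetical.

```python
import math
import random


def sample_variance(xs):
    """Scalar version of C(X_N): mean of squares minus squared mean (1/N normalization)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(x * x for x in xs) / n - mean * mean


rng = random.Random(0)
ensemble = [rng.gauss(0.0, 1.0) for _ in range(50)]

# Apply a random permutation to the ensemble members.
permuted = ensemble[:]
rng.shuffle(permuted)

# The statistic depends on the members only through symmetric sums,
# so it agrees for both orderings up to floating-point rounding.
invariant = math.isclose(
    sample_variance(ensemble), sample_variance(permuted), rel_tol=1e-9, abs_tol=1e-12
)
print(invariant)
```

This only illustrates invariance of the statistic itself; exchangeability is a property of the joint distribution and is not tested here.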

We now prove a weak law of large numbers for exchangeable ensembles. 

Lemma 2. If for all $N$, $X_N$, $U_N$ are ensembles of $\mathbb{R}^1$ valued random variables,
$[X_N; U_N]$ is exchangeable, $\mathrm{Cov}(U_{Ni}, U_{Nj}) = 0$ for all $i \ne j$, $U_{N1} \in L^2$ is the
same for all $N$, and $X_{N1} \to U_{N1}$ in $L^2$, then $\bar X_N \Rightarrow E(U_{N1})$.

Proof. Since $X_N$ is exchangeable, $\mathrm{Cov}(X_{Ni}, X_{Nj}) = \mathrm{Cov}(X_{N1}, X_{N2})$ for all
$i, j = 1, \ldots, N$, $i \ne j$. Since $X_N - U_N$ is exchangeable, also $X_{N2} - U_{N2} \to 0$
in $L^2$. Then, using the identity $\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$ and
Cauchy inequality for the $L^2$ inner product $E(XY)$, we have

$$|\mathrm{Cov}(X_{N1}, X_{N2}) - \mathrm{Cov}(U_{N1}, U_{N2})| \le 2\|X_{N1}\|_2 \|X_{N2} - U_{N2}\|_2 + 2\|U_{N2}\|_2 \|X_{N1} - U_{N1}\|_2,$$

so $\mathrm{Cov}(X_{N1}, X_{N2}) \to 0$. By the same argument, $\mathrm{Var}(X_{N1}) \to \mathrm{Var}(U_{N1}) < +\infty$.
Now $E(\bar X_N) = E(X_{N1}) \to E(U_{N1})$ from $X_{N1} - U_{N1} \to 0$ in $L^2$, and

$$\mathrm{Var}(\bar X_N) = \frac{1}{N^2} \left( \sum_{i=1}^N \mathrm{Var}(X_{Ni}) + \sum_{i,j=1,\, i \ne j}^N \mathrm{Cov}(X_{Ni}, X_{Nj}) \right) = \frac{1}{N} \mathrm{Var}(X_{N1}) + \left(1 - \frac{1}{N}\right) \mathrm{Cov}(X_{N1}, X_{N2}) \to 0,$$

and the conclusion follows from Chebyshev inequality. □
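
The variance identity used at the end of this proof can be verified directly. The following sketch builds the covariance matrix of a hypothetical exchangeable scalar ensemble (common variance `v`, common pairwise covariance `c`, both made-up values) and compares the variance of the sample mean computed from the full matrix with the closed form $\frac{1}{N} v + (1 - \frac{1}{N}) c$.

```python
def exchangeable_cov(n, v, c):
    """N x N covariance matrix with variance v on the diagonal and covariance c elsewhere."""
    return [[v if i == j else c for j in range(n)] for i in range(n)]


def var_of_mean(cov):
    """Variance of the sample mean: (1/N^2) times the sum of all entries of cov."""
    n = len(cov)
    return sum(sum(row) for row in cov) / n ** 2


v, c = 2.0, 0.05  # hypothetical common variance and pairwise covariance
results = []
for n in (10, 100, 400):
    direct = var_of_mean(exchangeable_cov(n, v, c))
    closed_form = v / n + (1 - 1 / n) * c
    results.append((n, direct, closed_form))
    print(n, direct, closed_form)
```

With `c` held fixed, the variance of the mean approaches `c` rather than 0 as `n` grows, which is why the argument needs $\mathrm{Cov}(X_{N1}, X_{N2}) \to 0$ and not only exchangeability.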

The convergence of the ensemble sample covariance for nearly i.i.d. 
exchangeable ensembles follows. 

Lemma 3. If for all $N$, $X_N$, $U_N$ are ensembles of $\mathbb{R}^n$ valued random elements,
$[X_N; U_N]$ is exchangeable, $U_N$ are i.i.d., $U_{N1} \in L^4$ is the same for all $N$, and
$X_{N1} \to U_{N1}$ in $L^4$, then $\bar X_N \Rightarrow E(U_{N1})$ and $C(X_N) \Rightarrow \mathrm{Cov}(U_{N1})$.

Proof. From Lemma 2, it follows that $[\bar X_N]_j \Rightarrow [E(U_{N1})]_j$ for each entry
$j = 1, \ldots, n$, so $\bar X_N \Rightarrow E(U_{N1})$. Let $Y_{Ni} = X_{Ni} X_{Ni}^{\mathrm T}$, so that
$C(X_N) = \bar Y_N - \bar X_N \bar X_N^{\mathrm T}$. Each entry $[Y_{Ni}]_{j\ell} = [X_{Ni}]_j [X_{Ni}]_\ell$ satisfies the assumptions
of Lemma 2, so $[\bar Y_N]_{j\ell} \Rightarrow E([U_{N1} U_{N1}^{\mathrm T}]_{j\ell})$. Convergence of the entries
$[\bar X_N \bar X_N^{\mathrm T}]_{j\ell} = [\bar X_N]_j [\bar X_N]_\ell$ to $E([U_{N1}]_j) E([U_{N1}]_\ell)$ follows from the already
proved convergence of $\bar X_N$ and Slutsky's theorem (Chow and Teicher, 1997, p.
254). Applying Slutsky's theorem again, we get $C(X_N) \Rightarrow \mathrm{Cov}(U_{N1})$. □

3. Formulation of the EnKF

Consider an initial state given as the random variable $U^{(0)}$. In step $k$,
the state $U^{(k-1)}$ is advanced in time by applying the model $\mathcal{M}^{(k)}$ to obtain
$U^{(k),f} = \mathcal{M}^{(k)}(U^{(k-1)})$, called the prior or the forecast, with probability density
function (pdf) $p_{U^{(k),f}}$. The data in step $k$ are given as measurements $d^{(k)}$ with
a known error distribution, and expressed as the data likelihood $p(d^{(k)} \mid u)$. The
new state $U^{(k)}$ conditional on the data, called the posterior or the analysis, then
has the density $p_{U^{(k)}}$ given by the Bayes theorem, $p_{U^{(k)}}(u) \propto p(d^{(k)} \mid u)\, p_{U^{(k),f}}(u)$,
where $\propto$ means proportional. This is the discrete time filtering problem. The
distribution of $U^{(k)}$ is called the filtering distribution.

Assume $U^{(0)} \sim N(u^{(0)}, Q^{(0)})$, the model is linear, $\mathcal{M}^{(k)} : u \mapsto A^{(k)} u + b^{(k)}$,
and the data likelihood is normal, $d^{(k)} \sim N(H^{(k)} u^{(k),f}, R^{(k)})$ given $u^{(k),f}$,
where $H^{(k)}$ is the given observation matrix and $R^{(k)}$ is the given data error
covariance, and the data error is independent of the model state. Then the
filtering distribution is normal, $U^{(k)} \sim N(u^{(k)}, Q^{(k)})$, and it satisfies the KF
recursions (Anderson and Moore, 1979)

$$u^{(k),f} = E(U^{(k),f}) = A^{(k)} u^{(k-1)} + b^{(k)}, \qquad Q^{(k),f} = \mathrm{Cov}(U^{(k),f}) = A^{(k)} Q^{(k-1)} A^{(k)\mathrm T},$$

$$u^{(k)} = u^{(k),f} + L^{(k)} (d^{(k)} - H^{(k)} u^{(k),f}), \qquad Q^{(k)} = (I - L^{(k)} H^{(k)}) Q^{(k),f},$$

where the Kalman gain matrix is given by

$$L^{(k)} = Q^{(k),f} H^{(k)\mathrm T} (H^{(k)} Q^{(k),f} H^{(k)\mathrm T} + R^{(k)})^{-1}. \tag{1}$$
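
For orientation, the KF recursions and the gain (1) can be sketched in Python for a scalar state, where all matrices reduce to numbers. The scalar restriction, the function name `kf_step`, and the numerical values are assumptions made for this illustration and are not part of the analysis.

```python
def kf_step(u, q, a, b, h, r, d):
    """One Kalman filter step for a scalar state.

    u, q: mean and variance of the previous filtering distribution
    a, b: linear model u -> a*u + b
    h, r: observation coefficient and data error variance
    d:    measurement
    """
    u_f = a * u + b                        # forecast mean
    q_f = a * q * a                        # forecast variance
    gain = q_f * h / (h * q_f * h + r)     # Kalman gain, scalar form of (1)
    u_a = u_f + gain * (d - h * u_f)       # analysis mean
    q_a = (1.0 - gain * h) * q_f           # analysis variance
    return u_a, q_a, gain


# Made-up example: standard normal prior, identity model, unit data error variance.
u, q, L = kf_step(u=0.0, q=1.0, a=1.0, b=0.0, h=1.0, r=1.0, d=2.0)
print(u, q, L)  # 1.0 0.5 0.5
```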

The EnKF is essentially based on the following observation. Let
$U_i^{(0)} \sim N(u^{(0)}, Q^{(0)})$ and $D_i^{(k)} \sim N(d^{(k)}, R^{(k)})$ be independent for all $k, i \ge 1$. Given
$N$, choose the initial ensemble and the perturbed data as the first $N$ terms of
the respective sequence, $U_{Ni}^{(0)} = U_i^{(0)}$, $i = 1, \ldots, N$, $D_{Ni}^{(k)} = D_i^{(k)}$, $i = 1, \ldots, N$,
$k = 1, 2, \ldots$ Define the ensembles $U_N^{(k)}$ by applying the KF formulas to each
ensemble member separately using the corresponding member of perturbed data,

$$U_{Ni}^{(k),f} = \mathcal{M}^{(k)}(U_{Ni}^{(k-1)}), \quad i = 1, \ldots, N, \tag{2}$$

$$U_{Ni}^{(k)} = U_{Ni}^{(k),f} + L^{(k)} (D_{Ni}^{(k)} - H^{(k)} U_{Ni}^{(k),f}). \tag{3}$$

The next lemma shows that $U_N^{(k)}$ is a sample from the filtering distribution.

Lemma 4. For all $k = 1, 2, \ldots$, $U_N^{(k)}$ is i.i.d. and $U_{N1}^{(k)} \sim N(u^{(k)}, Q^{(k)})$.

Proof. The statement is true for $k = 0$ by definition of $U_N^{(0)}$. Assume that it is
true for $k - 1$ in place of $k$. The ensemble $U_N^{(k)}$ is i.i.d. and normally distributed
because it is an image under a linear map of the normally distributed i.i.d.
ensemble with members $[U_{Ni}^{(k-1)}, D_{Ni}^{(k)}]$, $i = 1, \ldots, N$. Further, $D_N^{(k)}$ and $U_N^{(k),f}$
are independent, so from Burgers et al. (1998, eq. (15) and (16)), $U_{N1}^{(k)}$ has the
correct mean and covariance, which determines the normal distribution of $U_{N1}^{(k)}$
uniquely. □
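
Lemma 4 can be illustrated empirically. The sketch below, again for a scalar state with made-up parameter values, applies (2) and (3) with the exact gain (1) to a large reference ensemble and compares its sample mean and variance with the KF mean and variance for one step. It is a Monte Carlo check under these assumptions, not a proof.

```python
import random
import statistics

rng = random.Random(1)
N = 20000

# Illustrative scalar problem (all values are assumptions for this example).
u0, q0 = 0.0, 1.0                        # initial mean and variance
a, b, h, r, d = 1.0, 0.0, 1.0, 1.0, 2.0  # model, observation, data error variance, data

# Exact KF quantities for one step.
q_f = a * q0 * a
gain = q_f * h / (h * q_f * h + r)        # exact Kalman gain, scalar form of (1)
kf_mean = (a * u0 + b) + gain * (d - h * (a * u0 + b))
kf_var = (1.0 - gain * h) * q_f

# Reference ensemble: i.i.d. initial members and perturbed data D_i ~ N(d, r).
U0 = [rng.gauss(u0, q0 ** 0.5) for _ in range(N)]
D = [rng.gauss(d, r ** 0.5) for _ in range(N)]
U_f = [a * x + b for x in U0]                              # eq. (2)
U = [x + gain * (di - h * x) for x, di in zip(U_f, D)]     # eq. (3)

sample_mean = statistics.fmean(U)
sample_var = statistics.pvariance(U)
print(kf_mean, sample_mean)
print(kf_var, sample_var)
```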






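The EnKF is now obtained by replacing the exact covariance by the
ensemble sample covariance. The ensembles produced by EnKF are $X_N^{(0)} = U_N^{(0)}$
and

$$X_{Ni}^{(k),f} = \mathcal{M}^{(k)}(X_{Ni}^{(k-1)}), \quad i = 1, \ldots, N, \tag{4}$$

$$X_{Ni}^{(k)} = X_{Ni}^{(k),f} + K_N^{(k)} (D_{Ni}^{(k)} - H^{(k)} X_{Ni}^{(k),f}), \tag{5}$$

where $K_N^{(k)}$ is the ensemble sample gain matrix,

$$K_N^{(k)} = Q_N^{(k),f} H^{(k)\mathrm T} (H^{(k)} Q_N^{(k),f} H^{(k)\mathrm T} + R^{(k)})^{-1}, \qquad Q_N^{(k),f} = C(X_N^{(k),f}). \tag{6}$$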
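
A minimal sketch of one EnKF step (4)-(6) for a scalar state follows. The only difference from the reference update is that the gain is computed from the sample variance of the forecast ensemble. The function name `enkf_step`, the scalar restriction, and the tiny two-member ensemble are assumptions chosen so that the result can be checked by hand.

```python
def enkf_step(X, D, a, b, h, r):
    """One EnKF step with perturbed data for a scalar state.

    X: ensemble members from the previous step
    D: perturbed data, one value per member
    """
    N = len(X)
    X_f = [a * x + b for x in X]                              # forecast, eq. (4)
    mean = sum(X_f) / N
    q_f = sum(x * x for x in X_f) / N - mean * mean           # sample variance C(X_f), 1/N normalization
    gain = q_f * h / (h * q_f * h + r)                        # sample gain, eq. (6)
    X_a = [x + gain * (d - h * x) for x, d in zip(X_f, D)]    # analysis, eq. (5)
    return X_a, gain


# Two members with sample variance 1, identity model, data value 2 for both members.
X_a, K = enkf_step(X=[-1.0, 1.0], D=[2.0, 2.0], a=1.0, b=0.0, h=1.0, r=1.0)
print(X_a, K)  # [0.5, 1.5] 0.5
```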

4. Convergence analysis 

Lemma 5. There exist constants $c(k,p)$ for all $k$ and all $p < \infty$ such that
$\|X_{N1}^{(k)}\|_p \le c(k,p)$ and $\|K_N^{(k)}\|_p \le c(k,p)$ for all $N$.

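Proof. For $k = 0$, each $X_{N1}^{(0)}$ is normal. Assume $\|X_{N1}^{(k-1)}\|_p \le c(k-1,p)$ for
all $N$ and all $p < \infty$. Then

$$\|X_{N1}^{(k),f}\|_p = \|A^{(k)} X_{N1}^{(k-1)} + b^{(k)}\|_p \le \|A^{(k)}\| \, \|X_{N1}^{(k-1)}\|_p + \|b^{(k)}\| \le \mathrm{const}(k,p).$$

By Jensen's inequality, for any $X_N$, $\|\frac{1}{N} \sum_{i=1}^N X_{Ni}\|_p \le \frac{1}{N} \sum_{i=1}^N \|X_{Ni}\|_p$. This
gives $\|\bar X_N^{(k),f}\|_p \le \mathrm{const}(k,p)$ and

$$\|Q_N^{(k),f}\|_p \le \|X_{N1}^{(k),f} X_{N1}^{(k),f\,\mathrm T}\|_p + \|\bar X_N^{(k),f}\|_{2p}^2 \le \|X_{N1}^{(k),f}\|_{2p}^2 + \|X_{N1}^{(k),f}\|_{2p}^2 \le \mathrm{const}(k,p),$$

since from Cauchy inequality,

$$\|WZ\|_p \le E(\|W\|^p \|Z\|^p)^{1/p} \le E(\|W\|^{2p})^{1/(2p)} E(\|Z\|^{2p})^{1/(2p)} = \|W\|_{2p} \|Z\|_{2p}, \tag{7}$$

for any compatible random matrices $W$ and $Z$.

Since $Q_N^{(k),f}$ is symmetric positive semidefinite and $R^{(k)}$ is
symmetric positive definite, it holds that
$\|(H^{(k)} Q_N^{(k),f} H^{(k)\mathrm T} + R^{(k)})^{-1}\| \le \|(R^{(k)})^{-1}\| \le \mathrm{const}(k)$, which, together with the bound on $\|Q_N^{(k),f}\|_p$, gives
$\|K_N^{(k)}\|_p \le \|Q_N^{(k),f}\|_p \, \mathrm{const}(k) \le \mathrm{const}(k,p)$. Finally, we obtain the desired bound

$$\|X_{N1}^{(k)}\|_p \le \|X_{N1}^{(k),f}\|_p + \|K_N^{(k)} D_{N1}^{(k)}\|_p + \|K_N^{(k)} H^{(k)} X_{N1}^{(k),f}\|_p$$

$$\le \mathrm{const}(k,p) \left( \|X_{N1}^{(k),f}\|_p + \|D_{N1}^{(k)}\|_p + \|K_N^{(k)}\|_{2p} \|X_{N1}^{(k),f}\|_{2p} \right) \le c(k,p),$$

using again (7) and, for the term with $D_{N1}^{(k)}$, the bound on $\|K_N^{(k)}\|_p$ together with the independence of $K_N^{(k)}$ and $D_{N1}^{(k)}$. □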

Theorem 1. For all $k$, $[X_N^{(k)}; U_N^{(k)}]$ is exchangeable and $X_{N1}^{(k)} \to U_{N1}^{(k)}$ in $L^p$ for
all $p < +\infty$.

Proof. The ensembles $U_N^{(k)}$ are obtained by linear mapping of the i.i.d.
initial ensemble $U_N^{(0)}$, so they are i.i.d. Since $X_N^{(0)} = U_N^{(0)}$, $[X_N^{(0)}; U_N^{(0)}]$ is
exchangeable, and $X_{N1}^{(0)} = U_{N1}^{(0)}$. Suppose the statement holds for $k - 1$ in
place of $k$. The ensemble members are given by a recursion of the form

$$[X_{Ni}^{(k)}; U_{Ni}^{(k)}] = F^{(k)}(C(X_N^{(k-1)}), [X_{Ni}^{(k-1)}; U_{Ni}^{(k-1)}], D_{Ni}^{(k)}).$$

The ensemble sample
covariance matrix $C$ is permutation invariant, so $[X_N^{(k)}; U_N^{(k)}]$ is exchangeable by
Lemma 1. Subtracting (4) and (2) gives $X_{N1}^{(k),f} - U_{N1}^{(k),f} = A^{(k)} (X_{N1}^{(k-1)} - U_{N1}^{(k-1)})$,
so $X_{N1}^{(k),f} \to U_{N1}^{(k),f}$ in $L^p$ for all $p < +\infty$ by the induction assumption,
and $X_N^{(k),f}$ and $U_N^{(k),f}$ satisfy the assumption of Lemma 3. Thus, $C(X_N^{(k),f}) \Rightarrow
\mathrm{Cov}(U_{N1}^{(k),f})$, $K_N^{(k)} \Rightarrow L^{(k)}$ by the mapping theorem (Billingsley, 1995, p. 334),
and $X_{N1}^{(k)} \Rightarrow U_{N1}^{(k)}$ by Slutsky's theorem. Since for all $p < +\infty$, the sequence
$\{X_{N1}^{(k)}\}_{N=1}^{\infty}$ is bounded in $L^p$ by Lemma 5 and $U_{N1}^{(k)} \in L^p$, it follows that
$X_{N1}^{(k)} \to U_{N1}^{(k)}$ in $L^p$ for all $p < +\infty$ from uniform integrability (Billingsley,
1995, p. 338). □
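
The statement of Theorem 1 can be observed in a small numerical experiment. The sketch below couples an EnKF ensemble and the reference ensemble through the same initial members and the same perturbed data, as in the construction above, and estimates the $L^2$ distance between corresponding members after a few steps. The scalar model, the parameter values, the number of steps, and the averaging over seeds are all assumptions made for this illustration. The experiment suggests, but does not prove, decay of the error as $N$ grows.

```python
import random


def coupled_rms(N, steps, seed):
    """RMS difference between EnKF members and reference members sharing inputs."""
    rng = random.Random(seed)
    a, b, h, r, d = 0.9, 0.1, 1.0, 0.5, 1.0    # illustrative scalar model and data
    q = 1.0                                     # exact KF variance, starts at Q^(0)
    U = [rng.gauss(0.0, 1.0) for _ in range(N)]
    X = U[:]                                    # X^(0) = U^(0)
    for _ in range(steps):
        D = [rng.gauss(d, r ** 0.5) for _ in range(N)]   # shared perturbed data
        # Reference ensemble: exact gain from the KF covariance recursion.
        q_f = a * q * a
        L = q_f * h / (h * q_f * h + r)
        q = (1.0 - L * h) * q_f
        U_f = [a * u + b for u in U]
        U = [u + L * (di - h * u) for u, di in zip(U_f, D)]
        # EnKF ensemble: gain from the sample variance of the forecast.
        X_f = [a * x + b for x in X]
        mean = sum(X_f) / N
        q_n = sum(x * x for x in X_f) / N - mean * mean
        K = q_n * h / (h * q_n * h + r)
        X = [x + K * (di - h * x) for x, di in zip(X_f, D)]
    return (sum((x - u) ** 2 for x, u in zip(X, U)) / N) ** 0.5


def mean_rms(N, steps=3, trials=10):
    """Average the RMS difference over several seeds to reduce sampling noise."""
    return sum(coupled_rms(N, steps, seed) for seed in range(trials)) / trials


errors = {N: mean_rms(N) for N in (10, 100, 1000, 10000)}
for N, err in errors.items():
    print(N, err)
```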



Using Lemma 3 and uniform integrability again, it follows that the ensemble
mean and covariance are consistent estimators of the filtering mean and 
covariance. 

Corollary 1. $\bar X_N^{(k)} \to u^{(k)}$ and $C(X_N^{(k)}) \to Q^{(k)}$ in $L^p$ for all $p < +\infty$.